Discovering Video Clusters from Visual Features and Noisy Tags
نویسندگان
چکیده
We present an algorithm for automatically clustering tagged videos. Collections of tagged videos are commonplace, however, it is not trivial to discover video clusters therein. Direct methods that operate on visual features ignore the regularly available, valuable source of tag information. Solely clustering videos on these tags is error-prone since the tags are typically noisy. To address these problems, we develop a structured model that considers the interaction between visual features, video tags and video clusters. We model tags from visual features, and correct noisy tags by checking visual appearance consistency. In the end, videos are clustered from the refined tags as well as the visual features. We learn the clustering through a max-margin framework, and demonstrate empirically that this algorithm can produce more accurate clustering results than baseline methods based on tags or visual features, or both. Further, qualitative results verify that the clustering results can discover sub-categories and more specific instances of a given video category.
منابع مشابه
Discovering Visual Concept Structure with Sparse and Incomplete Tags
Discovering automatically the semantic structure of tagged visual data (e.g. web videos and images) is important for visual data analysis and interpretation, enabling the machine intelligence for effectively processing the fast-growing amount of multi-media data. However, this is non-trivial due to the need for jointly learning underlying correlations between heterogeneous visual and tag data. ...
متن کاملRecognition of Visual Events using Spatio-Temporal Information of the Video Signal
Recognition of visual events as a video analysis task has become popular in machine learning community. While the traditional approaches for detection of video events have been used for a long time, the recently evolved deep learning based methods have revolutionized this area. They have enabled event recognition systems to achieve detection rates which were not reachable by traditional approac...
متن کاملTags Re-ranking Using Multi-level Features in Automatic Image Annotation
Automatic image annotation is a process in which computer systems automatically assign the textual tags related with visual content to a query image. In most cases, inappropriate tags generated by the users as well as the images without any tags among the challenges available in this field have a negative effect on the query's result. In this paper, a new method is presented for automatic image...
متن کاملImage Tag Completion by Noisy Matrix Recovery
It is now generally recognized that user-provided image tags are incomplete and noisy. In this study, we focus on the problem of tag completion that aims to simultaneously enrich the missing tags and remove noisy tags. The novel component of the proposed framework is a noisy matrix recovery algorithm. It assumes that the observed tags are independently sampled from an unknown tag matrix and our...
متن کاملUnsupervised Video Tag Correction System
We present a new system for video auto tagging which aims at correcting and completing the tags provided by users for videos uploaded on the Internet. Unlike most existing systems, we do not learn any tag classifiers or use the questionable textual information to compare our videos. We propose to compare directly the visual content of the videos described by different sets of features such as B...
متن کامل